FIG. 1A.

Sequence length 2175 

CACGC6TCCGCAAATTTCCTGATTCTTTTGAATTAGGATTCCAGATGGGGGCCTCATTTCTACAGCCCCCAACATTCCT 

ATAGCCGTTATCACT6CCATCACCACT6CCACCAGCATCTTCTTGCAGATTCCACCCCTGCTCCM 

mGAAAGTGAGCAGAAAGGAAGCTCTCAGAAAAATCTCTAGTGGTGGCTGCCGTCGCTC^^ 

M G W L F L K V L L A G V S F S G 17

CTTCACCACC ATG GGC TGG CTT TTT CTA AAG GTT TTG TTG GCG GGA GTG AGT TTC TCA GGA 51

F LYPLVDFCISGKTRGQKPN 37 

TTT CTT TAT CCT CTT GTG GAT TTT TGC ATC AGT GGG AAA ACA AGA GGA CAG AAG CCA AAC 111 

F V I I L A D D M G W G D L G A N W A E 57

TTT GTG ATT ATT TTG GCC GAT GAC ATG GGG TGG GGT GAC CTG GGA GCA AAC TGG GCA GAA 171

T K D T A N L D K M A S E G M R F V D F 77

ACA AAG GAC ACT GCC AAC CTT GAT AAG ATG GCT TCG GAG GGA ATG AGG TTT GTG GAT TTC 231

HAAASTCSPSRASLLTGRLG 97 

CAT GCA GCT GCC TCC ACC TGC TCA CCC TCC CGG GCT TCC TTG CTC ACC GGC CGG CTT GGC 291 

LRNGVTRNFAVTSVGGLPLN 117 

CTT CGC AAT GGA GTC ACA CGC AAC TTT GCA GTC ACT TCT GTG GGA GGC CTT CCG CTC AAC 351 

E T T L A E V L Q Q A G Y V T G I I G K 137 

GAG ACC ACC TTG GCA GAG GTG CTG CAG CAG GCG GGT TAC GTC ACT GGG ATA ATA GGC AAA 411 

W H L G H H G S Y H P N F R G F D Y Y F 157

TGG CAT CTT GGA CAC CAC GGC TCT TAT CAC CCC AAC TTC CGT GGT TTT GAT TAC TAC TTT 471

G I P Y S H D M G C T D T P G Y N H P P 177 

GGA ATC CCA TAT AGC CAT GAT ATG GGC TGT ACT GAT ACT CCA GGC TAC AAC CAC CCT CCT 531 

C P A C P Q G D G P S R N L Q R D C Y T 197

TGT CCA GCG TGT CCA CAG GGT GAT GGA CCA TCA AGG AAC CTT CAA AGA GAC TGT TAC ACT 591

D V A L P L Y E N L N I V E Q P V N L S 217

GAC GTG GCC CTC CCT CTT TAT GAA AAC CTC AAC ATT GTG GAG CAG CCG GTG AAC TTG AGC 651

S L A Q K Y A E K A T Q F I Q R A S T S 237

AGC CTT GCC CAG AAG TAT GCT GAG AAA GCA ACC CAG TTC ATC CAG CGT GCA AGC ACC AGC 711

G R P F L L Y V A L A H M H V P L P V T 257 

GGG AGG CCC TTC CTG CTC TAT GTG GCT CTG GCC CAC ATG CAC GTG CCC TTA CCC GTG ACT 771 

Q L P A A P R G R S L Y G A G L W E M D 277

CAG CTA CCA GCA GCG CCA CGG GGC AGA AGC CTG TAT GGT GCA GGG CTC TGG GAG ATG GAC 831 

S L V G Q I K D K V D H T V K E N T F L 297 

AGT CTG GTG GGC CAG ATC AAG GAC AAA GTT GAC CAC ACA GTG AAG GAA AAC ACA TTC CTC 891 

W F T G D N G P W A Q K C E L A G S V G 317

TGG TTT ACA GGA GAC AAT GGC CCG TGG GCT CAG AAG TGT GAG CTA GCG GGC AGT GTG GGT 951 

P F T G F W Q T R Q G G S P A K Q T T W 337

CCC TTC ACT GGA TTT TGG CAA ACT CGT CAA GGG GGA AGT CCA GCC AAG CAG ACG ACC TGG 1011 

E G G H R V P A L A Y W P G R V P V N V 357 
GAA GGA GGC CAC CGG GTC CCA GCA CTG GCT TAC TGG CCT GGC AGA GTT CCA GTT AAT GTC 1071



T S T A L L S V L D I F P T V V A L A Q 377

ACC AGC ACT GGC TTG TTA AGC GTG CTG GAC ATT TTT CCA ACT GTG GTA GCC CTG GCC CAG 1131

A S L P Q G R R F D G V D V S E V L F G 397

GCC AGC m CCT CAA GGA CGG CGG TTT GAT GGT GTG GAC GTC TCC GAG GTG CTC TTT GGC 1191

R S Q P G H R V L F H P N S G A A G E F 417

CGG TCA CAG CCT GGG CAC AGG GTG CTG TTC CAC CCC AAC AGC GGG GCA GCT GGA GAG TTT 1251

G A L Q T V R L E R Y K A F Y I T G G A 437

GGA GCC CTG CAG ACT GTC CGC CTG GAG CGT TAC AAG GCC TTC TAC ATT ACC GGT GGA GCC 1311

R A C D G S T G P E L Q H K F P L I F N 457

AGG GCT TGT GAT GGG AGC ACG GGG CCT GAG CTG CAG CAT AAG TTT CCT CTG ATT TTC AAC 1371

L E D D T A E A V P L E R G G A E Y Q A 477

CTG GAA GAC GAT ACC GCA GAA GCT GTG CCC CTA GAA AGA GGT GGT GCG GAG TAC CAG GCT 1431

V L P E V R K V L A D V L Q D I A N D N 497

GTG CTG CCC GAG GTC AGA AAG GTT CTT GCA GAC GTC CTC CAA GAC ATT GCC AAC GAC AAC 1491

I S S A D Y T Q D P S V T P C C N P Y Q 517

ATC TCC AGC GCA GAT TAC ACT CAG GAC CCT TCA GTA ACT CCC TGC TGT AAT CCC TAC CAA 1551

I A C R C Q A A * 526

ATT GCC TGC CGC TGT CAA GCC GCA TAA 1578

CAGACCAAnTHAnCCACGAGGAGGAGTACCTGGAAATTAGGCAAGTTTGCnCCAAATTTCATTTnACCCTCnT
ACAAACACACGCTHAGTnAGTCnGGAGTnAGTnTGGAGTTAGCCnGCATATCCCTTCTGTAJCCTGTCCCTCC
TCCACGCCGACCCGAGAGCAGCTGAGCTGCGCTGGCTCTGGGCACCCAGTGTGCCTTAATGGGAAGCACAO^^^
GAGTCAQGCACAQGTGCCAGCTCCAQCTnTGAACnGGGCAAnGTnAACCTAACCTGCAAGnGAnnGAGGGTT



FIG. 1B 
FIG. 4.

Prosite Pattern Matches
Prosite version: Release 12.2 of February 1995
>PS00001/ PDOC00001/ASN_GLYCOSYLATION N-glycosylation site.

Query: 117 NETT 120
Query: 215 NLSS 218
Query: 356 NVTS 359
Query: 497 NISS 500

>PS00005/ PDOC00005/PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 28 SGK 30
Query: 93 TGR 95
Query: 237 SGR 239
Query: 290 TVK 292
Query: 422 TVR 424

>PS00006/ PDOC00006/CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

Query: 120 TLAE 123
Query: 290 TVKE 293
Query: 335 TTWE 338
Query: 364 SVLD 367
Query: 444 TGPE 447
Query: 499 SSAD 502

>PS00008/ PDOC00008/MYRISTYL N-myristoylation site.

Query: 12 GVSFSG 17
Query: 33 GQKPNF 38
Query: 52 GANWAE 57
Query: 97 GLRNGV 102
Query: 113 GLPLNE 118
Query: 158 GIPYSH 163
Query: 328 GGSPAK 333
Query: 388 GVDVSE 393
Query: 418 GALQTV 423
Query: 435 GGARAC 440

>PS00009/ PDOC00009/AMIDATION Amidation site.

Query: 382 QGRR 385

>PS00149/ PDOC00117/SULFATASE_2 Sulfatases signature 2.

Query: 129 GYVTGIIGKW 138
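
The short-motif hits listed in FIG. 4 can be reproduced in outline with a regular-expression scan. The sketch below is illustrative only and is not the program that produced the figure: the regular expressions are the commonly published PROSITE consensus patterns for these site types, while the names `MOTIFS` and `scan`, the `offset` parameter, and the choice of test fragment (residues 113 to 123 as read from the FIG. 1A translation rows) are assumptions made for the example.

```python
import re

# PROSITE-style consensus patterns rewritten as Python regular expressions.
# The dictionary and function names are hypothetical, chosen for this sketch.
MOTIFS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # N-{P}-[ST]-{P}
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",       # [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",    # [ST]-x(2)-[DE]
    "AMIDATION": r".G[RK][RK]",             # x-G-[RK]-[RK]
}


def scan(protein, offset=1, motifs=MOTIFS):
    """Return (motif name, start, matched residues, end) for every hit.

    `offset` is the 1-based position of the first residue of `protein`
    in the full-length sequence, so fragments report full-length numbering.
    """
    hits = []
    for name, regex in motifs.items():
        # A lookahead wrapper reports overlapping matches as well.
        for m in re.finditer(f"(?=({regex}))", protein):
            start = m.start() + offset
            segment = m.group(1)
            hits.append((name, start, segment, start + len(segment) - 1))
    return sorted(hits, key=lambda h: (h[0], h[1]))


if __name__ == "__main__":
    # Residues 113-123 as read from the translation rows of FIG. 1A.
    fragment = "GLPLNETTLAE"
    for name, start, segment, end in scan(fragment, offset=113):
        print(f"{name}  Query: {start} {segment} {end}")
```

On this fragment the scan reports the N-glycosylation site NETT at 117 to 120 and the casein kinase II site TLAE at 120 to 123, which agrees with the corresponding entries in FIG. 4.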
FIG. 5A.

Input file Fbh23553fl.seq; Output File 23553.trans

Sequence length 4321 

CCCACGCGTCCffiCTAATGAATCTTGGGGCCGGTGTCGGGaOiGGCGGCTTGATCGGCM^ 

AGAGGCCAGGAGCGAGGGCAGCGAGGATCAGAGGCCAGGCCTTIXCGGCTGCC^^ 

GAGGAACAT6ACTCT(XCCCTTCQGAGGAGGAA6GAAGTC(mTGCC^ 

TTCCCAGAGCTTTTTCTCTAGAGAAGATmGAAGGimnnGTGCTGACGGCCACCC^^ 

AAACTTGGCAAATGACATGCAGGncnCAAGGCAGAATAATTGCAGAAAATCnCAA^^ 

CTGAATACCTCTGAGAATAGAGAnGATTAnCAACCAGGATACCTAAnCAAGAACTCCAG^^ 

MKYSCCALVLA 11 

CATnTGTCAGTTnGCAACATTQGACCAAATACA ATG AAG TAT TCT TGC TGT GCT CTG GTT TTG GCT 33

V L G T E L L G S L C S T V R S P R F R 31

GTC CTG GGC ACA GAA TTG CTG GGA AGC CTC TGT TCG ACT GTC AGA TCC CCG AGG TTC AGA 93

G R I Q Q E R K N I R P N I I L V L T D 51 

GGA CGG ATA CAG CAG GAA CGA AAA AAC ATC CGA CCC AAC ATT ATT CTT GTG CTT ACC GAT 153

D Q D V E L G S L Q V M N K T R K I M E 71

GAT CAA GAT GTG GAG CTG GGG TCC CTG CAA GTC ATG AAC AAA ACG AGA AAG ATT ATG GAA 213

H G G A T F I N A F V T T P M C C P S R 91 

CAT GGG GGG GCC ACC TTC ATC AAT GCC TTT GTG ACT ACA CCC ATG TGC TGC CCG TCA CGG 273 

S S M L T G K Y V H N H N V Y T N N E N 111 

TCC TCC ATG CTC ACC GGG AAG TAT GTG CAC AAT CAC AAT GTC TAC ACC AAC AAC GAG AAC 333 

C S S P S W Q A M H E P R T F A V Y L N 131

TGC TCT Ttt CCC TCG TGG CAG GCC ATG CAT GAG CCT CGG ACT TTT GCT GTA TAT CTT AAC 393

N T G Y R T A F F G K Y L N E Y N G S Y 151

AAC ACT GGC TAC AGA ACA GCC TTT TTT GGA AAA TAC CTC AAT GAA TAT AAT GGC AGC TAC 453

I P P G W R E W L G L I K N S R F Y N Y 171

ATC CCC CCT GGG TGG CGA GAA TGG CTT GGA TTA ATC AAG AAT TCT CGC TTC TAT AAT TAC 513

T V C R N G I K E K H G F D Y A K D Y F 191

ACT GTT TGT CGC AAT GGC ATC AAA GAA AAG CAT GGA TTT GAT TAT GCA AAG GAC TAC TTC 573

T D L I T N E S I N Y F K M S K R M Y P 211

ACA GAC TTA ATC ACT AAC GAG AGC ATT AAT TAC TTC AAA ATG TCT AAG AGA ATG TAT CCC 633

H R P V M M V I S H A A P H G P E D S A 231

CAT AGG CCC GTT ATG ATG GTG ATC AGC CAC GCT GCG CCC CAC GGC CCC GAG GAC TAC GCC 693

P Q F S K L Y P N A S Q H I T P S Y N Y 251 

CCA CAG m TCT AAA CTG TAC CCC AAT GCT TCC CAA CAC ATA ACT CCT AGT TAT AAC TAT 753 

A P N M D K H W I M Q Y T G P M L P I H 271

GCA CCA AAT ATG GAT AAA CAC TGG ATT ATG CAG TAC ACA GGA CCA ATG CTG CCC ATC CAC 813 

M E F T N I L Q R K R L Q T L M S V D D 291 

ATG GAA TTT ACA AAC ATT CTA CAG CGC AAA AGG CTC CAG ACT TTG ATG TCA GTG GAT GAT 873 

S V E R L Y N M L V E T G E L E N T Y I 311
FIG. 5B.



TCT GTG GAG AGG CTG TAT AAC ATG CTC GTG GAG ACG GGG GAG CTG GAG AAT ACT TAC ATC 933

I Y T A D H G Y H I G Q F G L V K G K S 331
ATT TAC ACC GCC GAC CAT GGT TAC CAT ATT GGG CAG TTT GCA CTG GTC AAG GGG AAA TCC 993
M P Y D F D I R V P F F I R G P S V E P 351
ATG CCA TAT GAC TTT GAT ATT CGT GTG CCT TTT TTT ATT CGT GGT CCA AGT GTA GAA CCA 1053
G S I V P Q I V L N I D L A P T I L D I 371
GGA TCA ATA GTC CCA CAG ATC GTT CTC AAC ATT GAC TTG GCC CCC ACG ATC CTG GAT ATT 1113
A G L D T P P D V D G K S V L K L L D P 391
GCT GGG CTC GAC ACA CCT CCT GAT GTG GAC GGC AAG TCT GTC CTC AAA CTT CTG GAC CCA 1173
E K P G N R F R T N K K A K I W R D T F 411

GAA AAG CCA GGT AAC AGG TTT CGA ACA AAC AAG AAG GCC AAA ATT TGG CGT GAT ACA TTC 1233
LVERGKFLRKKEESSKNIQQ 431 
CTA GTG GAA AGA GGC AAA TTT CTA CGT AAG AAG GAA GAA TCC AGC AAG AAT ATC CAA CAG 1293 
SNHLPKYERVKELCQQARYQ 451 
TCA AAT CAC HG CCC AAA TAT GAA CGG GTC AAA GAA CTA TGG CAG CAG GCC AGG TAC CAG 1353 
TA CEQPGQKWQCIEDTSGKL 471 
ACA GCC TGT GAA CAA CCG GGG CAG AAG TGG CAA TGC ATT GAG GAT ACA TCT GGC AAG CTT 1413 
R I H K C K G P S D L L T V R Q S T R N 491 
CGA ATT CAC AAG TGT AAA GGA CCC AGT GAC CTG CTC ACA GTC CGG CAG AGC ACG CGG AAC 1473
L Y A R G F H D K D K E C S C R E S G Y 511
CTC TAC GCT CGC GGC TTC CAT GAC AAA GAC AAA GAG TGC AGT TGT AGG GAG TCT GGT TAC 1533
RASRSQRKSQRQFLRNQGTP 531 
CGT GCC AGC AGA AGC CAA AGA AAG AGT CAA CGG CAA TTC TTG AGA AAC CAG GGG ACT CCA 1593 
K Y K P R F V H T R Q T R S L S V E F E 551
AAG TAC AAG CCC AGA TTT GTC CAT ACT CGG CAG ACA CGT TCC TTG TCC GTC GAA TTT GAA 1653
G E I Y D I N L E E E E E L Q V L Q P R 571
GGT GAA ATA TAT GAC ATA AAT CTG GAA GAA GAA GAA GAA TTG CAA GTG TTG CAA CCA AGA 1713
N I A K R H D E G H K G P R D L Q A S S 591
AAC ATT GCT AAG CGT CAT GAT GAA GGC CAC AAG GGG CCA AGA GAT CTC CAG GCT TCC AGT 1773
G G N R G R M L A D S S N A V G P P T T 611
GGT GGC AAC AGG GGC AGG ATG CTG GCA GAT AGC AGC AAC GCC GTG GGC CCA CCT ACC ACT 1833
V R V T H K C F I L P N D S I H C E R E 631
GTC CGA GTG ACA CAC AAG TGT TTT ATT CTT CCC AAT GAC TCT ATC CAT TGT GAG AGA GAA 1893
L Y Q S A R A W K D H K A Y I D K E I E 651
CTG TAC CAA TCG GCC AGA GCG TGG AAG GAC CAT AAG GCA TAC ATT GAC AAA GAG ATT GAA 1953
A L Q D K I K N L R E V R G H L K R R K 671
GCT CTG CAA GAT AAA ATT AAG AAT TTA AGA GAA GTG AGA GGA CAT CTG AAG AGA AGG AAG 2013
P E E C S C S K Q S Y Y N K E K G V K K 691
CCT GAG GAA TGT AGC TGC AGT AAA CAA AGC TAT TAC AAT AAA GAG AAA GGT GTA AAA AAG 2073
Q E K L K S H L H P F K E A A Q E V D S 711
CAA GAG AAA TTA AAG AGC CAT CTT CAC CCA TTC AAG GAG GCT GCT CAG GAA GTA GAT AGC 2133
FIG. 5C.



KLQLFKENNRRRKKERKEKR 731 
AAA CTG CAA CTT TTC AAG GAG AAC AAC CGT AGG AGG AAG AAG GAG AGG AAG GAG AAG AGA 2193

R Q R K G E E C S L P G L T C F T H D N 751
CGG CAG AGG AAG GGG GAA GAG TGC AGC CTG CCT GGC CTC ACT TGC TTC ACG CAT GAC AAC 2253

N H W Q T A P F W N L G S F C A C T S S 771
AAC CAC TGG CAG ACA GCC CCG TTC TGG AAC CTG GGA TCT TTC TGT GCT TGC ACG AGT TCT 2313

N N N T Y W C L R T V N E T H N F L F C 791
AAC AAT AAC ACC TAC TGG TGT TTG CGT ACA GTT AAT GAG ACG CAT AAT TTT CTT TTC TGT 2373

E F A T G F F E Y F D M N T D P Y Q L T 811
GAG TTT GCT ACT GGC TTT TTT GAG TAT TTT GAT ATG AAT ACA GAT CCT TAT CAG CTC ACA 2433

N T V H T V E R G I L N Q L H V Q L M E 831
AAT ACA GTG CAC ACG GTA GAA CGA GGC ATT TTG AAT CAG CTA CAC GTA CAA CTA ATG GAG 2493

L R S C Q G Y K Q C N P R P K N L D V G 851
CTC AGA AGC TGT CAA GGA TAT AAG CAG TGC AAC CCA AGA CCT AAG AAT CTT GAT GTT GGA 2553

N K D G G S Y D L H R G Q L W D G W E G 871
AAT AAA GAT GGA GGA AGC TAT GAC CTA CAC AGA GGA CAG TTA TGG GAT GGA TGG GAA GGT 2613

* 872

TAA 2616 

TCAGCCCCGTCTCACTGCAGACATCAACTGGCAAGGCCTAGAGGAGCTACACAGTGTGAATGAAAACATCTATG^^^ 

AGACAAAACTACAGACnAGTCTGGTGGACTGGACTAATTACnGAAGGATHAGATAGAGTAnTGCACTGCTGAAGA 

GTCACTATGAGCAAAATAA/V^CAAATAAGACTCAAACTGCTCAAAGTGACGGGTTCTTGGTTGTCTCTGCT^ 

TGTGTCAATGGAGATGGCCTCTGCTGACTCAGATGAAGACCCAAGGCATAAGGHGGGAAAACACCTCAm 

CCAGCTGACCnCAAACCCTGCATnGAACCGACCAACAHAAGTCCAGAGAGTAAACnGAATGGAATAACGAC^^ 

CAGAAGHAATCATTTGAAHCTGAACACTGGAGAAAAACCGAAAAATGGACGGGGCATGAAGAGACTAATCATCTa^ 

AACCGATnCAGTGGCGATGGCATGACAGAQCTAGAGCTCGGGCCCAGCCCCAGGCTGCAGCCCAHCGCA^ 

AAAGAACnCCCCAGTATGGTGGTCCTGGAAAGGACAHTHGAAGATCAACTATATCnCCTGTGCAnCCGATGGAA 

TTTCAGnCATCAGATGHCACCATGGCCACCGCAGAACACCGAAGTAATTCCAGCATAGCGGGGAAGATGTTGA^^ 

GGTGGAGAAGAATCACGAAAAGGAGAAGTCACAGCACCTAGAAQGCAGCGCCTCCTCnCACTCTCCTCTGATTAGA^^ 

AAACTCTTACCCnACCTAAACACAGTATnCTTTUAACnnTTATnGTAAACTAATAAAGGKAATCACAGCD^^ 

AACATTCCAAGCTACCCTGGGTACCTHGTGCAGTAGAAGCTAGTGAGCATGTGAGCAAGCGGTGTGCACACGGA^ 

CATCGnATAATTTACTATCTGCCAAGGAGTAGAAAGAAAQGCTGGGGATATnGGGTTGGCTnGGKTTTGATTTTTT 

GCnGGHGGTTGGTnGKACTAAAACAGTAnATCTnTGAATATCGTAGGGACATAARKyVW^^ 

YHRAKAKGSYURRAMKGGGSTYTYTSKKRKSTMVAmYKySCMCCYOT 

KTAATGAAGH 
Prosite Pattern Matches for 23553
Prosite version: Release 12.2 of February 1995

>PS00001/ PDOC00001/ASN_GLYCOSYLATION N-glycosylation site.

Query: 64 NKTR 67
Query: 111 NCSS 114
Query: 131 NNTG 134
Query: 148 NGSY 151
Query: 170 NYTV 173
Query: 197 NESI 200
Query: 240 NASQ 243
Query: 623 NDSI 626
Query: 773 NNTY 776
Query: 783 NETH 786

>PS00005/ PDOC00005/PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 24 TVR 26
Query: 27 SPR 29
Query: 66 TRK 68
Query: 96 TGK 98
Query: 206 SKR 208
Query: 400 TNK 402
Query: 425 SSK 427
Query: 468 SGK 470
Query: 484 TVR 486
Query: 488 STR 490
Query: 505 SCR 507
Query: 516 SQR 518
Query: 520 SQR 522
Query: 530 TPK 532
Query: 611 TVR 613
Query: 615 THK 617
Query: 635 SAR 637

>PS00006/ PDOC00006/CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

Query: 107 TNNE 110
Query: 288 SVDD 291
Query: 367 TILD 370
Query: 376 TPPD 379
Query: 452 TACE 455
Query: 505 SCRE 508
Query: 781 TVNE 784

FIG. 8A.
>PS00007/ PDOC00007/TYR_PHOSPHO_SITE Tyrosine kinase phosphorylation site.
Query: 637 RAWKDHKAY 645
>PS00008/ PDOC00008/MYRISTYL N-myristoylation site.

Query: 19 GSLCST 24
Query: 161 GLIKNS 166
Query: 325 GLVKGK 330
Query: 592 GGNRGR 597
Query: 763 GSFCAC 768
Query: 851 GNKDGG 856

>PS00523/ PDOC00117/SULFATASE_1 Sulfatases signature 1.

Query: 85 PMCCPSRSSMLTG 97

FIG. 8B.
FIG. 10A.

Input file Fbh25278FL1.seq; Output File 25278.trans
Sequence length 2940

CCA(mT(XGCCCACGCGT(XGGCTKCACOCCGCGTCTCAGGCTGG(XGGGCT^^ 

CGGCGCAGGGCCTGCGCnAGGCAGCGGGAGGCAGCTOiGCGCGGGCCT^ 

GCAGATaGGCCCAGCCGTCCGGCAGCCAGTCCCGGACCAGACACTGGACCGTCCaGGGGGCGC 

AGCATCCGAGCC6GCGGGCCGGTGGTGCGCCCTGGGCGCGCGAGGTGGTGAGGCCCCAGGAGCCCGGCGCGCCG6GACA 

MHTLTGFSLVSLLSF 15 

CGCGGGCCGGCTTGGCG ATG CAC ACC CTC ACT GGC TTC TCT CTG GTC AGC CTG CTC AGC TTC 45 

G Y L S W D W A K P S F V A D G P G E A 35

GGC TAC CTG TCC TGG GAC TGG GCC AAG CCG AGC TTC GTG GCC GAC GGG CCC GGG GAG GCT 105

G E Q P S A A P P Q P P H I I F I L T D 55

GGC GAG CAG CCC TCG GCC GCT CCG CCC CAG CCT CCC CAC ATC ATC TTC ATC CTC ACG GAC 165

D Q G Y H D V G Y H G S D I E T P T L D 75

GAC CAA GGC TAC CAC GAC GTG GGC TAC CAT GGT TCA GAT ATC GAG ACC CCT ACG CTG GAC 225

R L A A K G V K L E N Y Y I Q P I C T P 95 

AGG CTG GCG GCC AAG GGG GTC AAG TTG GAG AAT TAT TAC ATC CAG CCC ATC TGC ACG CCT 285 

S RSQLLTGRYQIHTGLQH S I 115 

TCG CGG AGC CAG CTC CTC ACT GGC AGG TAC CAG ATC CAC ACA GGA CTC CAG CAT TCC ATC 345 

IRPQQPNCLPLDQVTL PQKL 135 

ATC CGC CCA CAG CAG CCC AAC TGC CTG CCC CTG GAC CAG GTG ACA CTG CCA CAG AAG CTG 405 

Q E A G Y S THMVGKWHLG F Y R K 155 

CAG GAG GCA GGT TAT TCC ACC CAT ATG GTG GGC AAG TGG CAC CTG GGC TTC TAC CGG AAG 465 

E C L P T R R G F D T F L G S L T G N V 175 

GAG TGT CTG CCC ACC CGT CGG GGC TTC GAC ACC TTC CTG GGC TCG CTC ACG GGC AAT GTG 525

DYYTYDNCDGPGVCGFDLHE 195 

GAC TAT TAC ACC TAT GAC AAC TGT GAT GGC CCA GGC GTG TGC GGC TTC GAC CTG CAC GAG 585

GENVAWGLSGQYSTMLYAQR 215 

GGT GAG AAT GTG GCC TGG GGG CTC AGC GGC CAG TAC TCC ACT ATG CTT TAC GCC CAG CGC 645 

ASHILASHSPQRPL F L Y V A F 235 

GCC AGC CAT ATC CTG GCC AGC CAC AGC CCT CAG CGT CCC CTC TTC CTC TAT GTG GCC TTC 705

Q A V H T P L Q S P R E Y L Y R Y R T M 255

CAG GCA GTA CAC ACA CCC CTG CAG TCC CCT CGT GAG TAC CTG TAC CGC TAC CGC ACC ATG 765 

GNVARRKYAAMVTC M D E A V R 275 

GGC AAT GTG GCC CGG CGG AAG TAC GCG GCC ATG GTG ACC TGC ATG GAT GAG GCT GTG CGC 825 

NITWALKRY GFYNNSVIIFS 295 

AAC ATC ACC TGG GCC CTC AAG CGC TAC GGT TTC TAC AAC AAC AGT GTC ATC ATC TTC TCC 885 

S D N G G Q T F S G G S NWPLRGRK 315 

AGT GAC AAT GGT GGC CAG ACT TTC TCG GGG GGC AGC AAC TGG CCG CTC CGA GGA CGC AAG 945 

G T Y W E G G V R G L G F V H S P L L K 335 

GGC ACT TAT TGG GAA GGT GGC GTG CGG GGC CTA GGC TTT GTC CAC AGT CCC CTG CTC AAG 1005 
R K Q R T S R A L M H I T D W Y P T L V 355

aA AAG CAA CGG ACA AGC CG6 GCA CTG ATG CAC ATC ACT GAC TGG TAC CCG ACC CTG GTG 1065 

G L A G G T T S A A D G L D G Y D V W P 375

GGT CTG GCA GGT GGT ACC ACC TCA GCA GCC GAT GGG CTA GAT GGC TAC GAC GTG TGG CCG 1125 

A I S E G R A S P R T E I L H N I D P L 395

GCC ATC AGC GAG GGC CGG GCC TCA CCA CGC ACG GAG ATC CTG CAC AAC ATT GAC CCA CTC 1185 

Y N H A Q H G S L E G G F G I W N T A V 415
TAC AAC CAT GK CAG CAT GGC TCC CTG GAG GGC GGC TTT GGC ATC TGG AAC ACC GCC GTG 1245 

Q A A I R V G E W K L L T G D P G Y G D 435

CAG GCT GCC ATC CGC GTG GGT GAG TGG AAG CTG CTG ACA GGA GAC CCC GGC TAT GGC GAT 1305

W I P P Q T L A T F P G S W W N L E R M 455
TGG ATC CCA CCG CAG ACA CTG GCC ACC TTC CCG GGT AGC TGG TGG AAC CTG GAA CGA ATG 1365

A S V R Q A V W L F N I S A D P Y E R E 475

GCC AGT GTC CGC CAG GCC GTG TGG CTC TTC AAC ATC AGT GCT GAC CCT TAT GAA CGG GAG 1425

DLAGQRPDVVRTLLARLAEY 495 

GAC CTG GCT GGC CAG CGG CCT GAT GTG GTC CGC ACC CTG CTG GCT CGC CTG GCC GAA TAT 1485 

N R T A I P V R Y P A E N P R A H P D F 515

AAC CGC ACA GCC ATC CCG GTA CGC TAC CCA GCT GAG AAC CCC CGG GCT CAT CCT GAC TTT 1545

N G G A W G P W A S D E E E E E E E G R 535

AAT GGG GGT GCT TGG GGG CCC TGG GCC AGT GAT GAG GAA GAG GAG GAA GAG GAA GGG AGG 1605

A R S F S R G R R K K K C K I C K L R S 555

GCT CGA AGC TTC TCC CGG GGT CGT CGC AAG AAA AAA TGC AAG ATT TGC AAG CTT CGA TCC 1665

F F R K L N T R L M S Q R I * 570

TTT TTC CGT AAA CTC AAC ACC AGG CTA ATG TCC CAA CGG ATC TGA 1710

TGGTGGGGAGGGAGAAAACTGTCCTnAGAGGATCnCCCCACTCCGQCTTGGCCCTGCTGTnCTCAGQGAGAAGCCT 

GTCACATCTCCATCTACAGGGAGnGGAGGGTGTAGAGTCCCnGGnGAACAGGGTAGGGAQCCTGGATAGGAG^ 

TGGGAATAAACCAGACTGGGATGCCTGTGTCTCAGTCCTGCCTCCTCACGGACnGCTCTGTGACCT 

ATGAQCTTnAGCCTCAGTnCCTCATCTGTAAAATGAGCTCTAATGACTnGTGACTCTnOGTGTGQCCCTGGAGCC 

TGGGQCCACG6TGGAGna:TGGCCGGCCnGCCACnGACAACTCCTnAAGGCnCCCCCnAACACGGGATC^^^ 

TGGTGGTGTnCGGAGnGCCTGGABGCAACTCCAAGCCTGGCCCCCAGCTGAAGCATGQCAATCTGGCTGCTCTC^^ 

AQGGACCCCCAAGCGCTGTGGGTGGAGGGCAGGGGTCGGGGQGGnGACCncnGQGTCTTCACATGQCCTAGQC^ 

TCCTCCGGTCAGACTGGTGTCAGGCACCGTGGTGCAAAATTCCTCnCTGGCCCCTCCAGTACCCAGAGAAACTOa^^ 

GGCCAHAACTGCTGCAGCACCAAGQGTGGTAGAAAGAGCTGTGAAGAGCCCCCAAACCAGTACCAGG^ 

CTCCTGTGACCTGGGGCACAGncnGCCCTCTAGGCCTTGAHTCCCCACCTGCAAGTGGGGATGCCAGCCCTO^^^ 

TGCCTCCnCATGAGGCTCTGGAAGACTGGCCAAGGTTGTGGAGGAGCnGTGAACnGAUAAAGTGTCGTAACATGG 

AAAAAAAAAAAAAAAAAAAAAAGGGCGG 

FIG. 10B. 
FIG. 13.

Prosite Pattern Matches for 25278
Prosite version: Release 12.2 of February 1995
>PS00001/ PDOC00001/ASN_GLYCOSYLATION N-glycosylation site.
Query: 276 NITW 279
Query: 288 NNSV 291
Query: 466 NISA 469
Query: 496 NRTA 499

>PS00004/ PDOC00004/CAMP_PHOSPHO_SITE cAMP- and cGMP-dependent protein kinase phosphorylation site.

Query: 314 RKGT 317

>PS00005/ PDOC00005/PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 102 TGR 104
Query: 160 TRR 162
Query: 244 SPR 246
Query: 340 TSR 342
Query: 383 SPR 385
Query: 457 SVR 459
Query: 566 SQR 568

>PS00006/ PDOC00006/CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

Query: 67 SDIE 70
Query: 244 SPRE 247
Query: 268 TCMD 271
Query: 317 TYWE 320
Query: 363 SAAD 366
Query: 525 SDEE 528

>PS00007/ PDOC00007/TYR_PHOSPHO_SITE Tyrosine kinase phosphorylation site.

Query: 134 KLQEAGY 140

>PS00008/ PDOC00008/MYRISTYL N-myristoylation site.

Query: 110 GLQHSI 115
Query: 169 GSLTGN 174
Query: 205 GQYSTM 210
Query: 300 GQTFSG 305
Query: 321 GGVRGL 326
Query: 356 GLAGGT 361
Query: 402 GSLEGG 407
Query: 409 GIWNTA 414
Query: 447 GSWWNL 452

>PS00009/ PDOC00009/AMIDATION Amidation site.

Query: 312 RGRK 315
Query: 541 RGRR 544

>PS00149/ PDOC00117/SULFATASE_2 Sulfatases signature 2.
Query: 139 GYSTHMVGKW 148
>PS00523/ PDOC00117/SULFATASE_1 Sulfatases signature 1.
Query: 91 PICTPSRSQLLTG 103
26212 seqs

m Sequence (n t 7nft-gll8 codlno) 

'ScATOiTQmeTGAGAGGAGAGGAGGATncnGCCM^^ 
GCQBCGCGGGimGTGGnCTCCaXiTGGACT^ 

TTGATAaTTTmantCaTmOiGAAGTG^ 

ASffiimS™™ 

Q^TCCAHATIMIATAAACAGGAKAGATATGCTTO^^ 

ScttSStSSSt^ 

TCTGOiAiamTAAGTGAGGGTCnCKTCACCCCG^ 

CCTGGTOCAGGCTATGGGATCTGGAACACTimTCCA^^^ 

ACAGCGACTGGGTCCCCCCTCACTCmCAGC^ 

TATGGCTmCAACATCACAGCCGACttATATG^^ 

TCTCAeAGnCAACAAAACTa:AGTGCCG6TC^^ 

CATGGTATAAAGAQiywmAGAAAAAGAAGCCAAGCAAAAATC^ 

AGCAGAAAGCAGTCTCAGGnCAACnGCCAnCAQGTGn^^^ 

TCnATCmCATCTGmttTAGGTAAACCAG(M\m(KK:TC6ATAATATCa^ 

CAC 

Protein sequence 

MAPRfiPiMMWPSPQACVCPGKMLAHGAI^ 
RSNPilNGGVVGPVYKEETKKIGa'SKNQAE^ 

FIG. 15. 
Prosite Pattern Matches for 26212
Prosite version: Release 12.2 of February 1995

>PS00001/ PDOC00001/ASN_GLYCOSYLATION N-glycosylation site.

Query: 157 NATL 160
Query: 306 NVTL 309
Query: 318 NNSI 321
Query: 431 NGSW 434
Query: 497 NITA 500
Query: 527 NKTA 530

>PS00004/ PDOC00004/CAMP_PHOSPHO_SITE cAMP- and cGMP-dependent protein kinase phosphorylation site.

Query: 521 RRLS 524
Query: 562 KKPS 565

>PS00005/ PDOC00005/PKC_PHOSPHO_SITE Protein kinase C phosphorylation site.

Query: 131 TGK 133
Query: 189 TRR 191
Query: 243 TQR 245
Query: 413 SPR 415
Query: 489 TGK 491
Query: 509 SNR 511
Query: 559 TKK 561
Query: 576 SKK 578

>PS00006/ PDOC00006/CK2_PHOSPHO_SITE Casein kinase II phosphorylation site.

FIG. 18A.
Query: 298 SCLD 301
Query: 347 TYWE 350
Query: 386 SLAE 389
Query: 406 TISE 409

>PS00007/ PDOC00007/TYR_PHOSPHO_SITE Tyrosine kinase phosphorylation site.

Query: 163 KLKEVGY 169
>PS00008/ PDOC00008/MYRISTYL N-myristoylation site.

Query: 28 GALAGF 33
Query: 56 GAIJLAQ 61
Query: 139 GLQHSI 144
Query: 198 GSLLGS 203
Query: 235 GIYSTQ 240
Query: 329 GGQPTA 334
Query: 343 GSKGTY 348
Query: 351 GGIRAV 356
Query: 432 GSWAAG 437
Query: 439 GIWNTA 444

>PS00149/ PDOC00117/SULFATASE_2 Sulfatases signature 2.
Query: 168 GYSTHMVGKW 177
>PS00523/ PDOC00117/SULFATASE_1 Sulfatases signature 1.
Query: 120 PICTPSRSQFITG 132

FIG. 18B.
Alignments of top-scoring domains:

Sulfatase: domain 1 of 1, from 36 to 462: score 323.0, E = 3.5e-93

i->PNl III laDDlGlodlGcyGnptlrtpnlDrLAeeGlrFtnayvttp 
PN+++i laDlHG+QdlG+ + t t n+D +A+eG+rF ++ ++++ 
25277 36 PNI^IILADDHGyGiGANVAETKDTAOKMASEGMRI^DnW^ 82 

ICiPSRAalLTGRyphrtGnytnnragvlpftgwsleGgJpldetJJpel 
C+PSRA+ILTGR+ r+6++ n + +s +G9lpl+ettl+e+ 
25277 83 TCSPSRASaTGRLGLRNGVTRNFAV TS-VGGLPNETTLAEV 124 

LkeaGYaTonvGKVHlgyneessosdfohlPlgrGFdyfygnlGGEdQyY 
L++aGY+l9++GKyHl9++ ++ +P rGF<ft'++9 
25277 125 LQQAGYVTGIIGKMHLGHHGSY HPMFRGFDYYF6 158 

plvdaUpftndtytceggygfskdvalkplgalgvnevieapdkaladyk 

25277 159 IPYSH-DMGCT D 169 

iaga Invphhvf Eyadryagavdvgrpf lav 1 1 f prpaacf lypnatws 
t+o+ + p + ++++ +r + ++ + a+ ly n+ +V+ 

25277 170 TPGYNGPP — -CPACPQGBGPSRNLQRDCY-TDVALPLYENLNIVE 211 

qpnphsp ItoPrpiq I ladea Ipf lerngqrdkpf f ly Isykhvh I prda 
qp s 1+ q ♦a++Q +f++r+ + +pf+ly++++h+h+p 
25277 212 QPVNLSSLA QIOrAEKATQFIQRASTSGRPRiYVALAHMHVP-- 253 

pnlfsskdfagssrrglYgl I IDsveenDdgvgrvlnoLdelNGUdnTl 
1+ + a r lYg + + enD +vg++ + +d + +nT+ 
25277 254 -4PVTQIJ'AAPRGRSLYG— AGLyEMDSLVGQIia)l(^ 296 

I iFTSllDhGghlgahghlglragGsngpfrg. gKgtnlye 

FT D4G+ ++ + + Gs gpf g ++++++mK+-t+ +e 
25277 297 LVFTG-MPVAQKCELA GSVGPFTGf f qtrqqqspAKQH-VE 338 

gGtRvpl ivrwPeGl lopgqvsdelvslnDlfPTi IdLAGoplPgvaQgv 
gG+RvP++++wP G+ + + +s +1 s++D+fPT+++LA a+lP 
25277 339 GGHRVPALAYVP-GRVPVNVTSTALLSVLDIFPTWALAQASLP 381 

kdrl IDGvsLlplLlgaagssrhetlfyesycnegrgf Ipavrwgkkkah 
+ r DGv++ ++L g+ +++h lf++ n g a++ + 
25277 382 QGRRFDGVDVSEVLFGR-SQP6HRVLFHP— NSG AAGEFGALQT 422 

f rtpn I agw qrvdf ddvwk If ntvedf nrsgddacrhgdvckc Igkprrs 
+r + + k+f ++++ ++++ g+ + + 
25277 423 VRLE RYKAFYITGGAR— ACDGSTGPaOHKF 452 

vW)dppllydlsrDP<-i 

plU+D

25277 453 ~ PLIFNLEDDT 462

FIG. 19.



// 
Alignments of top-scoring domains:

Sulfatase: domain 1 of 1, from 43 to 467: score 268.9, E = 6.5e-77

i->PN I U I laDD IG I gd IGcyGnpi I rtpn I DrLAeeG IrF "tnoyvttp 
PNI+1+1+DD++ ++IG+ ++ ♦+ + +6 F na+vtto 
23553 43 PNIILVLTDDQ^VaGSLI^--YMNKTRKIMEHQGATFIMAFVTTP 85 

ICtPSRAQ ILTGRyphrtGnytnnrogv Ipf tgws leGg Ip Idett Ipe I 
+C+PSR++ LTG+y h++++ytnn++ ++++ w+ ++ 
23553 86 M(XPSRSSIlTGiaVIMMYTNNEN--CSSPSyQ AMHEPRTFAVY 129 

LkeaGYoTgnvGKVH Igyneessasdf oh IP IgrG Fdyf ygn IGEdQW 
L + GY+T+++GK++++yn ++ +P+g+ ++ +n 
23553 130 LNNTGYRTAFFGKYLNEYNGSY IPPGWReVLaiKN 165 

Yp I vda I Ipf tndtytceggygf skdva Ikp Iga Igyneveapdka lady 
++f+n + c++g + ++++ ++++ dy 
23553 SRFYN-YTVCRNG IKEXHGFBYAK DY 190 

ktoga Invphhvf EWQdryagavdvgrpf lov 1 1 f prpaacf lypnatw 
+t++++n + y++++ P++++++ + 
23553 191 FTDLITNES INYFKMSK RMYPHRPVMMV 1 219 

sqpnphsp ItoPrpwq I ladea Ipf lerngqrdkpf f ly Isykhvh 1 prd 
s+ +ph p + + ++++ + p+ + + + + +++ •H<h+ ++++ 
23553 220 SHAAPHGPED-S-~APQFSKLYPNASQH-ITPSYNYAPNMDKHyiMQYT 264 

opn If sskdf agssrrg I Yg 1 1 IDsveenDdgvgrv InaLde ING I IdnT 
+pnU + +f+ ++r++ + +++++Dd+v+r++n L e G+l+nT 
23553 265 GPMLPIHMEFTNILQRKRLQ TLMSVDDSVERLYNMLVET-GELENT 309 

1 11 FTS I IDhGgh Igahgh Ig I ragGsngpf rggKgtn lyegGiRvP 1 1 v 
+M+T+ DHG+h+g++g+ + gK+++ y++++RvP+++ 
23553 310 YIIYTA-DHGYKIGQFGGLV K-GK»1P-YDFDIRVPFFI 344 

rwPeGl lapgqvsdelvslnDlfPTl IdLAGaplPgvaagvkdrl IDGvs 
r+P +pg+++ ++V ++DI+PTI ld+AG++ P +DG+S 
23553 345 R6P-SVEPGSIVPQIVLNIDLAPTILDIAGLDTP PDVDGKS 384 

Hp IL Igaagssrhet I f yesycnegrgf Ipavrwgkkkahf rtpn I agi 
+1+IL+ + ++ ++f + + +++ +++ + + +f 
23553 385 VLKLLDPE KPGNRFRT-NKKAK— IWRDFLYERGKF 418 

qrvdfddvwklfntvedfnrsgddacrhgdvckclgkprrsvthhdppll 
+ k + + + ++S++ + + + +c ++++ ++ +++P + 
23553 LRKKEESSKNIQQSNHLPKYERVKELCQQARYQTA-CEQPGQK 460 

ydlsrDP<-x 

FIG. 20.

23553 461 UQCIEDT 467 ' 
Alignments of top-scoring domains:

Sulfatase: domain 1 of 1, from 47 to 471: score 289.7, E = 3.6e-83

i->PN ll ll laDD IG I gd IGcyGnpt I rtpn i DrLAeeG IrFtnayvttp 
P+I++II+DD+6+ d+6 +G + l+tp++DrLA+ G+++ n y+ +p 
25278 47 PHIIFILTDDQ6YHDVGYHG-SDIETPTLDRLAAKGVKLEN-YYIQP 91 



25278 



ICtPSRAa ILTGRyphrtGnytnnrogv Ipf tgws leGg Ip Idett Ipe I 
+CtPSR++lLTGRy+++tG+++ + p+++ +lpld +tlp+ 
92 ICTPSRSQLLTGRYQIHTaQHSIIR™PQQPN CLPLDQVTLPQK 134 



25278 



LkeaGYQTgnvGKyHlgyneessasdfahlPlgrGFdyfygnlGGEdQVY 
L+eaGY T+nvGKVHlg +++++ lP++rGFd+f+g+ 
135 LQEAGYSTHMV6KVHLGFYRKEC LPTRRGFDTaGS 170 



25278 171 



p (vda I Ipf tndty tceggygf skdva Ikp Iga Igvneveapdka ladyk 
I + d+yt+++ ++ 
LTGNVDYYTYDN CD 184 



25278 185 



25278 203 



25278 245 



25278 



25278 330 



25278 373 



25278 418 



taga InvphhvfEVadryagQvdvgrpf lav ll f prpaacf lypnatws 
+9+ + +d + + +++ 

GPGVCG FD- LHEGENVAVG 



202 



qpnphsp I taPrpwq I ladea Ipf lerngqrdkpf f ly Isykhvh I prd 
++S++ +a++a I ++ +p f ly+++++vh+p ++ 
LSGQYSTML YAQRASHILASH-SPQRPLFLYVAFQAVHTPLQs 244 

apnlfsskdfagssrrglYgl I IDsveenDdgvgrvlnaLdelNGl IdnT 
+ +++ ++ g+ r+ Y+ ++V nD++v ++ aL++ G ++n 
PREYLYRYRTMGNVARRKYA—AMVTCMieVRNITWALKRYHJFYW^ 290 



U IFTSI IDhGghlgahghlglragGsngpfrggKgtnlyegGtRvPl Iv 
+HF+S D+Gg++ gGsn+p+rg+Kgt +egG+R ++v 
291 YIIFSS-DNGQQTF S GGSNVPLRGRKGn-WEGGVRGLGFY 329 



rwPeGl lopggvsdelvslnDlfPTl IdLAGoplPgvoagvkdrl IDGvs 
+4P +++ ++S++1 ++ D++PT++ LAG++ + IDG++ 

HSP-LLKRKQRTSRALMHITDWYPTLVGLAGGTTS AADGLDGYD 372 

L IplL Igaagssrhet If ye sycnegrgf Ipavrwgkkkahfrt 

++P++ ++ +s+r e+l+++++ ++ +++ g + g ++ + + 
WPAISEGRASPRTEILHNIdplynhAQHGSLEG GFGIWNTAVQA 417 



pn I . agwqrvdf ddvwk If ntvedf nrsgddacrhgdvckc Igkprrsvt 
+ + w + + ++d+ +++ 0 + g + + ++ 
AIRvGEUK aiGDPGYGDVIPPQTLATFPGSVUNLER MAS 



457 



25278 



hhdppllydlsrDP<-i 
+ 1+++S+DP 
458 VRQAWLFNISADP 471 
Alignments of top-scoring domains:

Sulfatase: domain 1 of 1, from 76 to 502: score 324.5, E = 1.3e-93

i->PNvU I loDDlGlgdlgcyghptlrPnldrLAeeGlrFtnhytatp 
P+ ++IIqDD+G+ d+9++g ++I TP+ld+LA+eG+++ n+y+ +p 
26212 76 PHLIFILADDQGmVGYHG-SEIKPTLDKLAAEGVKLENYYV-QP 120 

ICsPSRAoL ITGryphrhGnvsngr Igv Igf taksgg Ip IdettLpe ILk 
+C+PSR+++ TG+y++++G + + + ++ +lpld +tLp+ Lk 
26212 121 ICTPSRSQFITGKYQIHTGLQH SIIRPTQPNCLPLDNATLPQKLK 165 

eaGYaTglvGKVHlglnensdaagdgehlPlgwrGfdyfdgflygspfty 
e GY T++VGKVHI9+++ +e+ P++ rGfd f+g l+gs ++y 
26212 166 EVGYSTHMVGKWHLGFYR KECMPTR-RGFDTFFGSLLGSGDYY 207 

deencdngeoteppeaypeagw Ipq I Igyy ltd I ladka Ig I Idvasaag 
++ cd +P+ aa 
26212 208 THYKCD SPGM C6YDLYENDNAA- 229 

r I laka laosrPF ly I sppaphf s i If mf kevaqpyrapq I tq If vde 

++++ + ++tq+++++ 

26212 230 WDYD NGIYSTQMYTQR 245 

ttQdf I emk. ekPf f ly lof Ir Ihvhtp If spaed leskdf IgrsqrgrY 
++++++ kP f ly Q++ +vh pl++p + e+++ r+rY 
26212 246 VQQILASHNpTKPIFLYIAYQ-AVHSPLQAPGRYFEHYRSIININRRRY 293 

gd IveenDdlvGrv IdaLed 16 1 IdNT Iv i f TSDnGah legtpewygggn 
+++++ D+++++V oL+ G ++N ++l++SDnG g+p+ *gsH\ 
26212 294 AAMLSCLDEAINNVTLAUOYGFYNNSIIIYSSDNG GQPT-AGGSN 338 



26212 339 



9P IkogKoygs lyeGg i RvP I IvriPgg I opagrvkekse IvshvD loPT 
+pl+flKQ+ +eGQlR ++V++P ++g+v+ elv++ D++PT 
Vm&KGn~VEGGIRAVGFVHSP-LIJ(NKGTVCK--ELVHIT^ 383 
+ +IA + ++ d IDG++++ + +g + s+ + +++h+
26212 384 LISLAEGQIDE DIQLDGYDIVETISEGLR-SP-RVDILHN— 421 

ork Irovrwprksgktpk Ikahf f f tpat 

+++++!<+ + + a + ++ ++++++ +++++ ++ 
26212 422 „IDPIYTKAKN™6SVAAGYGiyNTalgsalrvqhwklltgnp9ysd «5 

.... dddtnngwecvgtvsqadd I edcrcegvetvthhdppe lyD IsrDP 
++++ n+g + ++ e t+ + +1++ ++DP 

26212 466 wvppQSFSNLG PNRWHNER-ITLSTGKSVyUNITADP 502 